Goto

Collaborating Authors

 free data selection


Supplementary Materials for the Paper " Towards Free Data Selection with General-Purpose Models " Anonymous Author(s) Affiliation Address email

Neural Information Processing Systems

In this supplementary material, we first explain the details of spectral clustering algorithm in Sec. B. We also analyze the sensitivity of FreeSel to the values of hyperparameters in3 Sec. C. Besides, FreeSel is compared with other intuitive baselines using the general-purpose model4 in Sec. D. Finally, implementation details of our experiments are explained in Sec. E. Our code will5 be made publicly available.6 In this section, we explain the spectral clustering algorithm [14, 18] in the semantic pattern extraction8 process for each image I (Sec.


Towards Free Data Selection with General-Purpose Models

Neural Information Processing Systems

A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets. However, current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly. In this paper, we challenge this status quo by designing a distinct data selection pipeline that utilizes existing general-purpose models to select data from various datasets with a single-pass inference without the need for additional training or supervision. A novel free data selection (FreeSel) method is proposed following this new pipeline. Specifically, we define semantic patterns extracted from inter-mediate features of the general-purpose model to capture subtle local information in each image. We then enable the selection of all data samples in a single pass through distance-based sampling at the fine-grained semantic pattern level.


Towards Free Data Selection with General-Purpose Models

Neural Information Processing Systems

A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets. However, current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly. In this paper, we challenge this status quo by designing a distinct data selection pipeline that utilizes existing general-purpose models to select data from various datasets with a single-pass inference without the need for additional training or supervision. A novel free data selection (FreeSel) method is proposed following this new pipeline. Specifically, we define semantic patterns extracted from inter-mediate features of the general-purpose model to capture subtle local information in each image.